Goto

Collaborating Authors

 upper-confidence frank-wolfe


Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe

Neural Information Processing Systems

We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve this problem, we analyze the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques for bandits and convex optimization. We give theoretical guarantees for the performance of this algorithm over various classes of functions, and discuss the optimality of these results.


Reviews: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe

Neural Information Processing Systems

This is a very interesting paper at the intersection of bandit problems and stochastic optimization. The authors consider the setup where at each time step a decision maker takes an action and gets as a feedback a gradient estimate at the chosen point. The gradient estimate is noisy but the assumption is that we have control over how noisy the gradient estimate is. The challenge for the decision maker is to make a sequence of moves such that the average of all iterates has low error compared to the optima. So the goal is similar to the goal in stochastic optimization, but unlike stochastic optimization (but like bandit problems) one gets to see limited information about the loss function.


Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe

Berthet, Quentin, Perchet, Vianney

Neural Information Processing Systems

We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve this problem, we analyze the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques for bandits and convex optimization. We give theoretical guarantees for the performance of this algorithm over various classes of functions, and discuss the optimality of these results.